BOOKKEEPER-968 Entry log flushes at configurable chunks by dlg99 · Pull Request #77 · apache/bookkeeper

dlg99 · 2016-11-11T08:14:24Z

With this patch one can configure interval (in bytes) for entry log to flush writes to the disk.

dlg99 · 2016-11-11T09:29:29Z

Failed test seems to be unrelated/flapper

merlimat · 2016-11-12T02:41:26Z

@dlg99 Have you considered the approach of using a rate limiter (eg: guava RateLimiter) with a configurable max number of MBytes/s to write when flushing?
It might be a bit inconvenient to properly tune for one's system, but setting the rate slightly below the disk write capacity should do the trick.

merlimat · 2016-11-12T02:43:29Z

On the other hand, we might even be able to do some auto-tuning: If I was only able to write let's say 200 MB in 1 second without the throttling. Next time I'll try to throttle at 190MB/s to not saturate the disk.

dlg99 · 2016-11-12T03:02:46Z

@merlimat limiting write rate won't help. One can write with relatively low write rate but since entry log only gets fileChannel.force() on log rotation we end up with a lot of writes cached in memory.
Once OS (when cache is too full or because of explicit fileChannel.force()) decide that it is time to flush the data to disk we have no control over the rate of writes. It does it at max possible speed, flushing 2048M on ssd with 400M/sec write throughput takes 5 sec if there are no other writes/reads.
I.e. in our production env I have configured flush to happen at every 10MB of entry log writes and so far I am seeing a lot less read latency spikes. I'll get some charts to show during meetup.

One can experiment with linux config (dirty_write_bytes / backround_dirty_write_bytes IIRC) but these are OS-wide setting (not per disk) and will affect other writes, i.e. I would not want to decrease these parameters to 10-50M for the rotational disk where we write application logs. Limiting these to the range of hundreds MB does not help much with the specific problem on hands.

jiazhai · 2016-11-16T10:08:36Z

@sijie , should we add items in doc/bookkeeperConfigParams.textile along with this changes?

In this link:
https://cwiki.apache.org/confluence/display/BOOKKEEPER/Contributing+to+BookKeeper
it said:
New server configuration settings should be added and documented in bookkeeper-server/conf/bk_server.conf, and also documented in doc/bookkeeperConfigParams.textile.

But seems, this textile file contains only the tip of the iceberg, a lot of config parameter is not added in. If needed, we may need a ticket to fix this.

merlimat

Change looks good to me.

sijie · 2016-11-21T21:22:30Z

+1 on what @jiazhai suggested. we should document all the new configurations.

jvrao · 2016-11-21T22:56:05Z

+1 on the patch.
@sijie we may need a jira item for what @jiazhai suggested.

sijie · 2016-11-21T23:52:16Z

@jvrao yup. another jira is fine. We need some jiras for documenting the settings introduced in 4.5.0 and mark them as blockers for 4.5.0 release.

sijie · 2016-11-21T23:52:23Z

+1 for this change

* changed all instances of `assert()` to junit versions in `src/test` Here is the script I used to find all instances of `assert()` inside the `src/test` folder: ``` grep -r "assert(" * 2>/dev/null | grep -v "main" ``` The `grep -v "main"` removes all instances of the usage within the main source tree (which there are quite a few). I did this as I assumed the JIRA ticket spirit was not to remove those instances from the main tree and thus would require including `junit` in core compilation rather than scoped for `test` as is. Author: Brennon York <brennon.york@capitalone.com> Reviewers: Sijie Guo <sijie@apache.org> Closes apache#77 from brennonyork/DL-122

@bug

) * (@bug W-4698540@) Increase openLedgerRereplicationGracePeriod openLedgerRereplicationGracePeriod defines how long the replication worker waits to fence an active extent before starting replication. The default value is 30 sec. Which should be enough for the writer to detect the bookie down incident and make an ensemble change. But given that we don't have back-pressure, sometimes client and bookies get backed up and may take longer time to detect and change the ensemble. Changing this value to 10 mins. Making this longer should not have any major impact as the writer is confirmed to everything it wrote. Also now we have explicitLAC so even delaying the fencing won't stop/restrict reader reading even the last entry/fragment Signed-off-by: Venkateswararao Jujjuri (JV) <vjujjuri@salesforce.com> * Applied review comments

BOOKKEEPER-968 Entry log flushes at configurable chunks

8710bf8

merlimat approved these changes Nov 16, 2016

View reviewed changes

asfgit closed this in 26b09ab Jan 31, 2017

dlg99 deleted the task/entry-log-flush branch October 14, 2021 23:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BOOKKEEPER-968 Entry log flushes at configurable chunks#77

BOOKKEEPER-968 Entry log flushes at configurable chunks#77
dlg99 wants to merge 1 commit into
apache:masterfrom
dlg99:task/entry-log-flush

dlg99 commented Nov 11, 2016

Uh oh!

dlg99 commented Nov 11, 2016

Uh oh!

merlimat commented Nov 12, 2016

Uh oh!

merlimat commented Nov 12, 2016

Uh oh!

dlg99 commented Nov 12, 2016

Uh oh!

jiazhai commented Nov 16, 2016 •

edited

Loading

Uh oh!

merlimat left a comment

Uh oh!

sijie commented Nov 21, 2016

Uh oh!

jvrao commented Nov 21, 2016

Uh oh!

sijie commented Nov 21, 2016

Uh oh!

sijie commented Nov 21, 2016

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

dlg99 commented Nov 11, 2016

Uh oh!

dlg99 commented Nov 11, 2016

Uh oh!

merlimat commented Nov 12, 2016

Uh oh!

merlimat commented Nov 12, 2016

Uh oh!

dlg99 commented Nov 12, 2016

Uh oh!

jiazhai commented Nov 16, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

merlimat left a comment

Choose a reason for hiding this comment

Uh oh!

sijie commented Nov 21, 2016

Uh oh!

jvrao commented Nov 21, 2016

Uh oh!

sijie commented Nov 21, 2016

Uh oh!

sijie commented Nov 21, 2016

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

jiazhai commented Nov 16, 2016 •

edited

Loading